Speeding up the HMC: QCD with Clover-Improved Wilson Fermions

We apply a recent proposal to speed up the Hybrid-Monte-Carlo simulation of systems with dynamical fermions to two flavor QCD with clover-improvement. For our smallest quark masses we see a speed-up of more than a factor of two compared with the standard algorithm.



1. Introduction 

It is clear that the simulation algorithms in use today will not be able to reach physical values of the quark masses. The scaling behavior of the algorithms (see e.g. [1]) predicts enormous costs for simulations at quark masses as light as those of the up and down quarks. To reach this physical point, extrapolations using $\chi$PT (chiral perturbation theory) have to be used. However, contact with $\chi$PT seems to occur only at rather small values of the quark masses. Any progress that makes simulations easier when approaching the small quark mass regime will therefore help to reach an overlap between $\chi$PT and lattice QCD, allowing for a safe extrapolation to physical quark masses.

The HMC (Hybrid-Monte-Carlo) algorithm [2] and its variants are still the methods of choice in large-scale simulations of lattice QCD with dynamical Wilson fermions. One obstacle in going to light quarks is that the step size of the integration scheme has to be reduced to maintain a constant acceptance rate. In ref. [3] we proposed to split the fermion matrix into two factors and to introduce a pseudo-fermion field for each factor. The numerical study of the two-dimensional Schwinger model showed that the step size can be enlarged and thus the computational effort can be reduced substantially this way. In ref. [4] we presented first results for lattice QCD with two flavors of Wilson fermions and clover improvement [5]. Here we present new results for lattices up to $16^3 \times 24$ at $\beta = 5.2$ and quark masses down to $m_{PS}/m_V \approx 0.686$.



2. The pseudo-fermion action 

The partition function of lattice QCD with two 
degenerate flavors of dynamical fermions is given 
by 

$$ Z = \int \mathcal{D}[U] \, \exp(-S_G[U]) \, \det M[U]^2 , \tag{1} $$

where $S_G[U]$ is the Wilson plaquette action. In our case, $M[U]$ is the Wilson fermion matrix with $O(a)$ (clover) improvement. For the details see e.g. ref. [6]. An important feature of our proposal [3] is that it can be applied on top of standard preconditioning. Here, we use even-odd preconditioning as detailed in ref. [6]. The fermion matrix can be written as

$$ M = \begin{pmatrix} 1_{ee} + T_{ee} & M_{eo} \\ M_{oe} & 1_{oo} + T_{oo} \end{pmatrix} , \tag{2} $$

where e refers to even sites and o to odd sites 
of the lattice. The determinant of the fermion 
matrix can now be written as 

$$ \det M \propto \det(1_{ee} + T_{ee}) \, \det \hat{M} , \tag{3} $$

where $\hat{M} = 1_{oo} + T_{oo} - M_{oe} (1_{ee} + T_{ee})^{-1} M_{eo}$.
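The factorization in eq. (3) is an instance of the standard block (Schur complement) determinant identity. The following is a minimal numerical sketch of that identity only: the blocks are small random matrices standing in for $1_{ee} + T_{ee}$, $1_{oo} + T_{oo}$, $M_{eo}$ and $M_{oe}$, and the block size `n` is a hypothetical choice. None of this is an actual lattice operator.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # hypothetical number of even (and odd) degrees of freedom

# Random stand-ins for the blocks of M (illustration only).
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # plays the role of 1_ee + T_ee
D = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # plays the role of 1_oo + T_oo
M_eo = 0.3 * rng.standard_normal((n, n))
M_oe = 0.3 * rng.standard_normal((n, n))

# Full matrix in even-odd block form.
M = np.block([[A, M_eo], [M_oe, D]])

# Schur complement on the odd sites: M_hat = D - M_oe A^{-1} M_eo.
M_hat = D - M_oe @ np.linalg.solve(A, M_eo)

det_full = np.linalg.det(M)
det_factorised = np.linalg.det(A) * np.linalg.det(M_hat)
print(det_full, det_factorised)
```

For these stand-in blocks the two printed numbers agree up to rounding, which is the statement used to reduce the problem to the odd sites.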

In the following, we shall consider the Hermitian matrix $Q = c_0 \gamma_5 \hat{M}$, where we have set $c_0 = 1$ in the simulations discussed below. The effective action for the standard HMC simulation reads [6]

$$ S_{eff}[U, \phi^\dagger, \phi] = S_G[U] + S_{det}[U] + S_F[U, \phi^\dagger, \phi] , \tag{4} $$

with $S_{det}[U] = -2\,\mathrm{Tr}\log(1_{ee} + T_{ee})$ and $S_F[U, \phi^\dagger, \phi] = \phi^\dagger Q^{-2} \phi$. In our study we keep $S_G[U]$ and $S_{det}[U]$ in their standard form. However, $S_F[U, \phi^\dagger, \phi]$ is replaced by alternative expressions: we split the fermion matrix $Q$ into two factors. The determinant of each factor is estimated by an integral over pseudo-fermion fields:

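$$ \det Q^2 = \det W W^\dagger \, \det [W^{-1} Q][W^{-1} Q]^\dagger \propto \int \mathcal{D}\phi_1^\dagger \int \mathcal{D}\phi_1 \int \mathcal{D}\phi_2^\dagger \int \mathcal{D}\phi_2 \, \exp(-S_{F1} - S_{F2}) , \tag{5} $$

where

$$ S_{F1} = \phi_1^\dagger (W W^\dagger)^{-1} \phi_1 , \qquad S_{F2} = \phi_2^\dagger \left([W^{-1} Q][W^{-1} Q]^\dagger\right)^{-1} \phi_2 . \tag{6} $$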

We considered two choices for W: 

$$ W = Q + \rho , \tag{7} $$

which is the original proposal of ref. [3]. Below we shall show only results for this choice. In addition we considered $W = Q + i\rho$, which was first tested in ref. [7]. Results for this choice are also presented in ref. [4]. It turns out that both choices for $W$ give a similar performance improvement of the algorithm.
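The first equality in eq. (5) holds for any invertible $W$, so it can be checked numerically for both choices of $W$ mentioned here. The sketch below does this with a small random Hermitian matrix standing in for $Q$; the matrix size, the random entries and the value of `rho` are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6      # hypothetical matrix size
rho = 0.5  # illustrative value of the shift parameter

# Random Hermitian stand-in for Q (not a lattice operator).
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q = (B + B.conj().T) / 2

def factorised_det(Q, W):
    """Return det(W W^dagger) * det([W^-1 Q][W^-1 Q]^dagger)."""
    R = np.linalg.solve(W, Q)  # R = W^{-1} Q
    return np.linalg.det(W @ W.conj().T) * np.linalg.det(R @ R.conj().T)

target = np.linalg.det(Q @ Q)  # det Q^2

det_real_shift = factorised_det(Q, Q + rho * np.eye(n))       # W = Q + rho
det_imag_shift = factorised_det(Q, Q + 1j * rho * np.eye(n))  # W = Q + i rho

print(target.real, det_real_shift.real, det_imag_shift.real)
```

The check says nothing about which choice of $W$ performs better in a simulation; it only confirms that the product of the two factors reproduces $\det Q^2$ in both cases.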

An important feature of our approach is that the variation of the modified pseudo-fermion action can be computed as easily as in the standard case. The variation of $S_{F1}$ is essentially the same as for the standard pseudo-fermion action. One only has to replace $Q$ by $W$. For the second part of the pseudo-fermion action we get

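$$ \delta S_{F2} = -X^\dagger \, \delta Q \, Y - Y^\dagger \, \delta Q^\dagger \, X + X^\dagger \, \delta W \, \phi_2 + \phi_2^\dagger \, \delta W^\dagger \, X \tag{8} $$

with the vectors

$$ X = (Q Q^\dagger)^{-1} W \phi_2 , \qquad Y = Q^{-1} W \phi_2 . \tag{9} $$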
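The variation in eq. (8) can be compared against a finite-difference derivative of $S_{F2} = \phi_2^\dagger W^\dagger (Q Q^\dagger)^{-1} W \phi_2$, which is eq. (6) rewritten. The sketch below is such a check under stated assumptions: $Q$ is a small, well-conditioned random complex matrix (not a lattice operator), $W = Q + \rho$ so that $\delta W = \delta Q$, and the direction `dQ` is an arbitrary random perturbation.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5  # hypothetical matrix size

def crandn(*shape):
    """Complex Gaussian random array."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

def dag(A):
    return A.conj().T

def S_F2(Q, W, phi):
    """phi^dagger W^dagger (Q Q^dagger)^{-1} W phi, i.e. eq. (6) rewritten."""
    v = W @ phi
    return np.vdot(v, np.linalg.solve(Q @ dag(Q), v)).real

rho = 0.5
Q = np.eye(n) + 0.2 * crandn(n, n)  # generic invertible stand-in for Q
W = Q + rho * np.eye(n)
phi = crandn(n)
dQ = crandn(n, n)                   # arbitrary direction of variation
dW = dQ.copy()                      # W = Q + rho implies delta W = delta Q

# Vectors of eq. (9).
X = np.linalg.solve(Q @ dag(Q), W @ phi)
Y = np.linalg.solve(Q, W @ phi)

# Eq. (8); np.vdot conjugates its first argument.
analytic = (-np.vdot(X, dQ @ Y) - np.vdot(Y, dag(dQ) @ X)
            + np.vdot(X, dW @ phi) + np.vdot(phi, dag(dW) @ X)).real

# Central finite difference along the same direction.
eps = 1e-6
numeric = (S_F2(Q + eps * dQ, W + eps * dW, phi)
           - S_F2(Q - eps * dQ, W - eps * dW, phi)) / (2 * eps)

print(analytic, numeric)
```

Note that only two linear solves (for $X$ and $Y$) are needed, which mirrors the remark that the modified action is as easy to differentiate as the standard one.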

3. Numerical results 

We have tested our modified algorithm at parameters that had been studied by UKQCD before [8]. We have performed simulations at $\beta = 5.2$ and $c_{sw} = 1.76$. Note that $c_{sw} = 1.76$ was a preliminary result for the improvement coefficient, while the final analysis resulted in $c_{sw} = 2.0171$ for $\beta = 5.2$ [9]. We have studied $\kappa = 0.137$, 0.139, 0.1395 and 0.1398. These values of $\kappa$ correspond to $m_{PS}/m_V \approx 0.856$, 0.792, 0.715 and 0.686, respectively. We applied periodic boundary conditions in all lattice directions, except for anti-periodic boundary conditions in the time direction for the fermion fields.

In addition to the standard leap-frog scheme, we used a scheme with a reduced coefficient of the $O(\delta\tau^2)$ corrections proposed by Sexton and Weingarten (see eq. (6.4) of ref. [10]). In order to eliminate the influence of the gauge action on the step size of the integration scheme, we used the split of time scales proposed in ref. [10]. In particular, we have computed the variation of the gauge action four times as frequently as that of the fermion action. As length of the trajectory we have always chosen $\tau = 1$.
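The split of time scales can be illustrated with a nested leap-frog on a toy system. The sketch below is not the integrator used in the simulations: it uses plain leap-frog at both levels (not the improved scheme of eq. (6.4) of ref. [10]), a single degree of freedom, and two hypothetical harmonic forces in which `force_fast` plays the role of the gauge force and `force_slow` that of the fermion force.

```python
def nested_leapfrog(q, p, force_slow, force_fast, dtau, n_steps, n_inner=4):
    """Leap-frog in which the fast force is evaluated n_inner times
    as often as the slow force (multiple time-scale splitting)."""
    h = dtau / n_inner
    for _ in range(n_steps):
        p = p + 0.5 * dtau * force_slow(q)      # half kick, slow force
        for _ in range(n_inner):                # inner leap-frog, fast force
            p = p + 0.5 * h * force_fast(q)
            q = q + h * p
            p = p + 0.5 * h * force_fast(q)
        p = p + 0.5 * dtau * force_slow(q)      # half kick, slow force
    return q, p

# Toy model (assumption): two harmonic terms with different stiffness.
k_fast, k_slow = 25.0, 1.0

def force_fast(q):
    return -k_fast * q

def force_slow(q):
    return -k_slow * q

def energy(q, p):
    return 0.5 * p * p + 0.5 * (k_fast + k_slow) * q * q

q0, p0 = 1.0, 0.5
# Trajectory of length tau = 1 with 20 outer steps.
q1, p1 = nested_leapfrog(q0, p0, force_slow, force_fast, dtau=0.05, n_steps=20)
# Reversibility: flip the momentum and integrate again.
q2, p2 = nested_leapfrog(q1, -p1, force_slow, force_fast, dtau=0.05, n_steps=20)

print(energy(q1, p1) - energy(q0, p0), q2 - q0, -p2 - p0)
```

The energy violation stays small although the stiff force alone would require the smaller step, and the trajectory retraces itself after a momentum flip, as required for the accept/reject step of the HMC.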

In our study, we applied BiCGstab as the solver. We stopped the solver when the iterated residual $r$ became smaller than a certain bound. To compute the action for the accept/reject step at the end of the trajectory we required $r^2 < 10^{-\ldots}$. Note that we take the absolute residual and not the relative one. To compute the variation of the fermionic action, we required a less strict criterion $r^2 < R^2$. It turns out that for $R^2$ smaller than some threshold the acceptance rate virtually does not depend on $R^2$, while it rapidly drops to zero as $R^2$ becomes larger than this threshold. In the simulations reported below, $R^2$ is chosen $10^{-\ldots}$ times the threshold or smaller.
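To make the stopping rule concrete, here is a minimal, unpreconditioned BiCGstab sketch that stops on the absolute squared norm of the iterated residual, once with a tight and once with a looser bound. The matrix is a small random, well-conditioned stand-in, and the two tolerance values are placeholders chosen for illustration; they are not the bounds used in the simulations.

```python
import numpy as np

def bicgstab(A, b, tol_sq, max_iter=1000):
    """Solve A x = b with BiCGstab. Stops when the iterated residual
    satisfies |r|^2 < tol_sq (absolute, not relative)."""
    x = np.zeros_like(b)
    r = b - A @ x
    r_hat = r.copy()
    rho_old = alpha = omega = 1.0
    v = np.zeros_like(b)
    p = np.zeros_like(b)
    for it in range(1, max_iter + 1):
        rho = np.vdot(r_hat, r)
        beta = (rho / rho_old) * (alpha / omega)
        p = r + beta * (p - omega * v)
        v = A @ p                       # first matrix application
        alpha = rho / np.vdot(r_hat, v)
        s = r - alpha * v
        t = A @ s                       # second matrix application
        omega = np.vdot(t, s) / np.vdot(t, t)
        x = x + alpha * p + omega * s
        r = s - omega * t               # iterated residual
        rho_old = rho
        if np.vdot(r, r).real < tol_sq:
            return x, it
    raise RuntimeError("BiCGstab did not converge")

rng = np.random.default_rng(3)
n = 30
A = 3.0 * np.eye(n) + 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)

# Placeholder tolerances: tight for the action, looser for the force.
x_tight, it_tight = bicgstab(A, b, tol_sq=1e-20)
x_loose, it_loose = bicgstab(A, b, tol_sq=1e-8)

print(it_tight, it_loose)
```

Each iteration costs two matrix applications, so a looser bound for the force computation translates directly into fewer matrix applications per trajectory.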

In table 1 we show results for the acceptance rate for an $8^3 \times 24$ lattice at $\kappa = 0.137$. After equilibration we generated 200 trajectories for each parameter set. The standard pseudo-fermion action is given by $\rho = 0$. First of all, for the same step size $\delta\tau$ the acceptance with the modified pseudo-fermion action is higher than for the standard action. The maximum of the acceptance rate is rather shallow for both integration schemes. It is located at $\rho \approx 0.5$.

Table 1
Acceptance rates $P_{acc}$ for the $8^3 \times 24$ lattice at $\beta = 5.2$, $\kappa = 0.137$ and $c_{sw} = 1.76$. Each run consists of 200 trajectories. The leap-frog runs (L) were performed with the step size $\delta\tau = 0.02$, the runs with the improved scheme (S) with $\delta\tau = 0.05$.

  $\rho$    $P_{acc}$ (L)    $P_{acc}$ (S)
  0.0       0.856(7)         0.876(10)
  0.1       0.914(8)         -
  0.3       0.940(5)         0.968(4)
  0.5       0.948(5)         0.973(2)
  0.6       0.944(4)         0.973(2)
  0.7       0.934(4)         0.969(3)
  1.0       0.938(7)         0.957(3)

Next we performed longer simulations (with 6000 to 8000 trajectories each) for the standard pseudo-fermion action ($\rho = 0$) and the optimal $\rho = 0.5$ in the modified pseudo-fermion case. This time we tuned the step size such that $P_{acc} \approx 0.8$. For the standard pseudo-fermion action and the leap-frog scheme we get the acceptance rate $P_{acc} = 0.793(3)$ with $\delta\tau = 0.025$. The leap-frog scheme with the modified pseudo-fermion action at $\rho = 0.5$ gives $P_{acc} = 0.770(3)$ for $\delta\tau = 0.04$. Using the improved integration scheme combined with the modified pseudo-fermion action, even a step size as large as $\delta\tau = 0.1$ gives $P_{acc} = 0.883(2)$. Note, however, that for the improved scheme the variation of the action has to be computed twice per time step and for the leap-frog only once. For these more extended runs we have computed integrated autocorrelation times for the value of the plaquette and the number of iterations that are needed by the solver. It turned out that within error bars the autocorrelation times are the same for all three runs reported above. Hence the modification of the pseudo-fermion action has no detectable effect on autocorrelation times (measured in units of trajectories).
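The remark that the improved scheme needs two evaluations of the variation per time step can be turned into a rough count of force evaluations per trajectory of length $\tau = 1$ for the three longer runs. This is only bookkeeping under a simplifying assumption: every force evaluation is counted as one unit, which ignores that the modified action involves two pseudo-fermion fields and that the solver cost per evaluation differs between runs.

```python
# (label, step size, force evaluations per step, quoted acceptance rate)
runs = [
    ("leap-frog, standard action", 0.025, 1, 0.793),
    ("leap-frog, modified action", 0.04, 1, 0.770),
    ("improved scheme, modified action", 0.1, 2, 0.883),
]

def force_evaluations(dtau, evals_per_step, tau=1.0):
    """Number of force evaluations for a trajectory of length tau."""
    n_steps = round(tau / dtau)
    return n_steps * evals_per_step

counts = [force_evaluations(dtau, k) for _, dtau, k, _ in runs]

for (label, dtau, k, p_acc), count in zip(runs, counts):
    print(f"{label}: {count} force evaluations, P_acc = {p_acc}")
```

In this crude counting the improved scheme with the modified action needs half as many force evaluations as the standard leap-frog run, while also having the highest acceptance rate of the three.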

In table 2 we give our results for the $16^3 \times 24$ lattice at our largest value of $\kappa$. Starting after equilibration, we performed 100 trajectories for each parameter set. Here the maximum of the acceptance rate has become sharper than for the $8^3 \times 24$ lattice at $\kappa = 0.137$. For the leap-frog scheme the largest acceptance rate is still reached at $\rho \approx 0.5$, while for the scheme of Sexton and Weingarten the optimum is shifted down to $\rho \approx 0.2$. It is interesting to note that the improved scheme profits much more from the modified pseudo-fermion action than the leap-frog scheme. In the case of the leap-frog scheme the step size can only be doubled, while for the improved scheme the step size can be more than tripled.

Table 2
Results for a $16^3 \times 24$ lattice at $\beta = 5.2$, $c_{sw} = 1.76$ and $\kappa = 0.1398$. The last column gives the total number of applications of the fermion matrix per trajectory.

  scheme   $\rho$   $\delta\tau$   $P_{acc}$   matrix applications
  L        0.       0.01           0.77(3)     17000
  L        0.05     0.02           0.62(3)     11700
  L        0.15     0.02           0.75(3)     10700
  L        0.3      0.02           0.76(2)     10100
  L        0.5      0.02           0.78(2)     9500
  L        1.0      0.02           0.64(4)     9300
  S        0.       0.02           0.64(4)     16500
  S        0.05     0.066..        0.35(4)     7500
  S        0.1      0.066..        0.67(3)     6500
  S        0.15     0.066..        0.72(3)     6800
  S        0.2      0.066..        0.74(3)     6900
  S        0.4      0.066..        0.49(3)     6100
  S        0.5      0.05           0.67(4)     7700

In the last column of table 2 we give the total number of applications of the fermion matrix per trajectory. This number is proportional to the numerical effort of the simulation. Comparing the leap-frog scheme at $\rho = 0$ with the improved scheme at $\rho = 0.2$, we find a reduction of the numerical cost by a factor of roughly 2.4.
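The quoted gain can be retraced from the table 2 entries. The snippet below computes the plain ratio of matrix applications and, as an additional rough measure that is an assumption of this sketch rather than something stated in the text, the ratio of matrix applications per accepted trajectory. The step size 0.066.. is used in its truncated form 0.066.

```python
# Entries copied from table 2.
leapfrog_standard = {"dtau": 0.01, "p_acc": 0.77, "n_apps": 17000}   # L, rho = 0
improved_standard = {"dtau": 0.02, "p_acc": 0.64, "n_apps": 16500}   # S, rho = 0
improved_modified = {"dtau": 0.066, "p_acc": 0.74, "n_apps": 6900}   # S, rho = 0.2

# Plain ratio of matrix applications per trajectory.
cost_ratio = leapfrog_standard["n_apps"] / improved_modified["n_apps"]

# Assumption of this sketch: cost per accepted trajectory = n_apps / P_acc.
cost_ratio_per_accepted = (
    (leapfrog_standard["n_apps"] / leapfrog_standard["p_acc"])
    / (improved_modified["n_apps"] / improved_modified["p_acc"])
)

# Step-size gain of the improved scheme from the modified action.
step_ratio = improved_modified["dtau"] / improved_standard["dtau"]

print(round(cost_ratio, 2), round(cost_ratio_per_accepted, 2), round(step_ratio, 2))
```

Both cost ratios come out between 2.3 and 2.5, in line with the factor of roughly 2.4, and the step-size ratio exceeds 3.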

REFERENCES 

1. T. Lippert, Nucl. Phys. B (Proc. Suppl.) 106 (2002) 193.

2. S. Duane, A.D. Kennedy, B.J. Pendleton and D. Roweth, Phys. Lett. B 195 (1987) 216.

3. M. Hasenbusch, Phys. Lett. B 519 (2001) 177.

4. M. Hasenbusch and K. Jansen, Nucl. Phys. B (Proc. Suppl.) 106 (2002) 1076.

5. B. Sheikholeslami and R. Wohlert, Nucl. Phys. B 259 (1985) 572.

6. K. Jansen and C. Liu, Comput. Phys. Commun. 99 (1997) 221.

7. M. Della Morte et al., Running quark mass in two flavor QCD, this conference, hep-lat/0209025.

8. Z. Sroczynski, Ph.D. thesis, Edinburgh (1998).

9. K. Jansen and R. Sommer, Nucl. Phys. B 530 (1998) 185.

10. J.C. Sexton and D.H. Weingarten, Nucl. Phys. B 380 (1992) 665.



